Heuristic Measures of Interestingness

Authors

Robert J. Hilderman

Howard J. Hamilton

Abstract

The tuples in a generalized relation (i.e., a summary generated from a database) are unique, and therefore, can be considered to be a population with a structure that can be described by some probability distribution. In this paper, we present and empirically compare sixteen heuristic measures that evaluate the structure of a summary to assign a single real-valued index that represents its interestingness relative to other summaries generated from the same database. The heuristics are based upon well-known measures of diversity, dispersion, dominance, and inequality used in several areas of the physical, social, ecological, management, information, and computer sciences. Their use for ranking summaries generated from databases is a new application area. All sixteen heuristics rank less complex summaries (i.e., those with few tuples and/or few non-ANY attributes) as most interesting. We demonstrate that for sample data sets, the order in which some of the measures rank summaries is highly correlated.
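As a rough illustration of the idea in the abstract, the sketch below treats the tuple counts of a summary as a probability distribution and reduces it to a single real-valued index. The abstract does not list the sixteen measures, so the two used here (variance of the probabilities around the uniform value, and Shannon entropy) are only assumed examples of well-known diversity measures, not necessarily the authors' formulations. The function names and the tuple counts are hypothetical.

```python
import math


def to_probabilities(counts):
    """Convert the tuple counts of a summary into probabilities that sum to 1."""
    total = sum(counts)
    return [c / total for c in counts]


def variance_index(counts):
    """Sample variance of the tuple probabilities around the uniform value 1/m.

    Assumed illustrative measure: 0.0 for a perfectly uniform summary,
    larger values for more skewed summaries.
    """
    p = to_probabilities(counts)
    m = len(p)
    if m < 2:
        return 0.0
    q = 1.0 / m
    return sum((pi - q) ** 2 for pi in p) / (m - 1)


def shannon_index(counts):
    """Shannon entropy (in bits) of the tuple probabilities.

    Assumed illustrative measure: maximal for a uniform summary,
    smaller for more skewed summaries.
    """
    p = to_probabilities(counts)
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)


# Hypothetical summaries: each list holds the counts attached to the tuples.
summaries = {
    "skewed": [90, 5, 5],
    "uniform": [25, 25, 25, 25],
}

for name, counts in summaries.items():
    print(name, variance_index(counts), shannon_index(counts))
```

Each measure yields one number per summary, so summaries generated from the same database can be sorted by it. Whether a higher or a lower value counts as more interesting depends on the measure and is left open in this sketch.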

Similar resources

Ranking the Interestingness of Summaries from Data Mining Systems

We study data mining where the task is description by summarization, the representation language is generalized relations, the evaluation criteria are based on heuristic measures of interestingness, and the method for searching is the Multi-Attribute Generalization algorithm for domain generalization graphs. We present and empirically compare four heuristics for ranking the interestingness of ...

Heuristics for Ranking the Interestingness of Discovered Knowledge

We describe heuristics, based upon information theory and statistics, for ranking the interestingness of summaries generated from databases. The tuples in a summary are unique, and therefore, can be considered to be a population described by some probability distribution. The four interestingness measures presented here are based upon common measures of diversity of a population: variance, the ...

Assessing the Interestingness of Discovered Knowledge Using a Principled Objective Approach

When mining a large database, the number of patterns discovered can easily exceed the capabilities of a human user to identify interesting results. To address this problem, various techniques have been suggested to reduce and/or order the patterns prior to presenting them to the user. In this paper, our focus is on ranking summaries generated from a single dataset, where attributes can be gener...

Selecting Perfect Interestingness Measures by coefficient of variation based Ranking Algorithm

Ranking interestingness measures is an active and essential research area in the process of knowledge discovery from extracted rules. Because researchers have proposed many measures for different situations, the list of measures keeps growing, and these cannot serve as common measures for evaluating rules, knowledge finders are not able to identify a perfect measure to ensure the actu...

DRP Report: Quantitative Association Mining From Bottom Up and Heuristic Search Perspectives

Traditional association mining focuses on discovering frequent patterns from categorical data, such as supermarket transaction data. Quantitative association mining (QAM) is a natural extension of traditional association mining. It refers to the task of discovering association rules from quantitative data instead of from categorical data. The discrepancies between the two typ...

Publication date: 1999
